1 Introduction

In (Cervesato 1998), we presented a general methodology for developing a compiler and
associated intermediate language for any abstract logic programming language (ALPL)
(Miller et al. 1991) that satisfies some basic proof-theoretic properties. We applied it
abstractly to the language of hereditary Harrop formulas and its linear variant, and also
based the concrete implementations of the Twelf (Pfenning and Schürmann 1999) and LLF
(Cervesato and Pfenning 2002) systems directly on it. This methodology identified right
sequent rules that behave like the left rules that can appear in a uniform proof and used
the corresponding connectives as the compilation targets of the constructs in program
clauses. The intermediate language was therefore just another ALPL and its abstract
machine relied on proof-search, like the source ALPL. Because the transformation was based
on the proof-theoretic duality between left and right rules, proving the correctness of
the compilation process amounted to a simple induction. Finally, for Horn clauses the
connectives in the target ALPL corresponded to key instructions in the Warren Abstract
Machine (WAM) (Warren 1983). The WAM is an essential component of commercial Prolog
systems since many compiled programs run over an order of magnitude faster than when
interpreted.

Up to then, the notoriously procedural instruction set of the WAM was regarded as a
wondrous piece of engineering without any logical status, in sharp contrast with the deep
logical roots of Prolog. In the words of (Börger and Rosenzweig 1995), "[the WAM]
resembles an intricate puzzle, whose many pieces fit tightly together in a miraculous way".
As a result, understanding it was complex in spite of the availability of excellent
tutorials (Aït-Kaci 1991), proving its correctness was a formidable task (Börger and
Rosenzweig 1995; Russinoff 1992), and adapting it to other logic programming languages was
a major endeavor: it was done for CLP(R) (Jaffar et al. 1992) and λProlog (Nadathur and
Mitchell 1999). By contrast, the methodology in (Cervesato 1998) is simple, (mostly)
logic-based, easily verifiable, and of general applicability.

The technique in (Cervesato 1998) had however one blemish: it made use of equality
over atomic formulas together with a second-order binder over atomic goals, which lacked
logical status. In this paper, we remedy this drawback by carefully massaging the head of
clauses. This allows us to replace those constructs with term-level equality and regular
universal quantifications over the arguments of a clause head. The result is an improved
proof-theoretic account of compilation for logic programs that sits squarely within logic.
It also opens the door to specializing the compilation process to well-moded programs,
which brings out the potential of doing away with unification in favor of matching, a more
efficient operation in many languages. We present these results for the language of
hereditary Harrop formulas and only at the highest level of abstraction. Just like
(Cervesato 1998), they are however general, both in terms of the source ALPL and of the
level of abstraction considered. We are indeed in the process of using them to implement a
compiler for CLF (Watkins et al. 2003; Cervesato et al. 2003), a higher-order concurrent
linear logic programming language that combines backward and forward chaining.

The paper is organized as follows: Section 2 recalls the compilation process of
(Cervesato 1998). In Section 3 we present our improved compilation process. In Section 4
we refine it to support moded programs. We lay out future developments in Sections 5 and 6.



2 Background and Recap 

In this section, we recall the compilation process presented in (Cervesato 1998). For
succinctness, we focus on a smaller source language: it corresponds to the language
underlying the Twelf system (Pfenning and Schürmann 1999), on which this technique was
first used. We will comment on larger languages, including those examined in
(Cervesato 1998), in Section 5.



2.1 Source Language 

We take the language freely generated from atomic propositions ($a$), intuitionistic
implication ($\supset$) and universal quantification ($\forall$) as our source language.
We expand the open-ended atomic propositions of (Cervesato 1998) into a predicate symbol
$p$ followed by zero or more terms $t$. A program is a sequence of closed formulas. This
language, which we call $\mathcal{L}^s$, is given by the following grammar:

$$
\begin{array}{llcl}
\text{Formulas:} & A & ::= & a \mid A_1 \supset A_2 \mid \forall x.\, A \\
\text{Programs:} & \Gamma & ::= & \cdot \mid \Gamma, A \\
\text{Atoms:} & a & ::= & p \mid a\ t
\end{array}
$$

As in (Cervesato 1998), we leave the language of terms open, but require that it be
predicative (substituting a term for a variable cannot alter the outer structure of a
formula). We



An Improved Proof-Theoretic Compilation of Logic Programs 






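Uniform provability

$$
\dfrac{\Gamma, A, \Gamma' \longrightarrow A \gg a}{\Gamma, A, \Gamma' \longrightarrow a}\;\mathsf{u\_atm}
\qquad
\dfrac{\Gamma, A_1 \longrightarrow A_2}{\Gamma \longrightarrow A_1 \supset A_2}
\qquad
\dfrac{c \text{ new} \quad \Gamma \longrightarrow [c/x]A}{\Gamma \longrightarrow \forall x.\, A}
$$

Immediate entailment

$$
\dfrac{}{\Gamma \longrightarrow a \gg a}
\qquad
\dfrac{\Gamma \longrightarrow A_1 \gg a \quad \Gamma \longrightarrow A_2}{\Gamma \longrightarrow A_2 \supset A_1 \gg a}\;\mathsf{i\_imp}
\qquad
\dfrac{\Gamma \longrightarrow [t/x]A \gg a}{\Gamma \longrightarrow \forall x.\, A \gg a}\;\mathsf{i\_all}
$$

Fig. 1. Uniform Deduction System for $\mathcal{L}^s$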



will often write an atom $a$ as $p\,\vec{t}$, where $p$ is its predicate symbol and
$\vec{t}$ is the sequence of terms it is applied to. We implicitly assume that a predicate
symbol is consistently applied to the same number of terms throughout a program, its
arity. We write $[t'/x]t$ (resp. $[t'/x]A$) for the capture-avoiding substitution of term
$t'$ for all free occurrences of variable $x$ in term $t$ (resp. in formula $A$).
Simultaneous substitution is denoted $[\vec{t}'/\vec{x}]t$ and $[\vec{t}/\vec{x}]A$.

$\mathcal{L}^s$ is an abstract logic programming language (Miller et al. 1991) and, for
appropriate choices of the term language, has indeed the same expressive power as λProlog
(Miller and Nadathur 1986) or Twelf (Pfenning and Schürmann 1999). It differs from the
first language discussed in (Cervesato 1998) by the omission of conjunction and truth
(see Section 5).

The operational semantics of $\mathcal{L}^s$ is given by the two judgments

$$
\begin{array}{ll}
\Gamma \longrightarrow A & \text{$A$ is uniformly provable from $\Gamma$} \\
\Gamma \longrightarrow A \gg a & \text{$a$ is immediately entailed by $A$ in $\Gamma$}
\end{array}
$$

Their defining rules, given in Figure 1, produce uniform proofs (Miller et al. 1991): the
uniform provability judgment includes the right sequent rules for $\mathcal{L}^s$ and, once
the goal is atomic, rule u_atm calls the immediate entailment judgment, which focuses on a
program formula $A$ and decomposes it as prescribed by the left sequent rules. This
strategy is complete with respect to the traditional sequent rules of this logic
(Miller et al. 1991). From a logic programming perspective, the connectives appearing in
the goal, handled by right rules, are search directives, while the left rules carry out a
run-time preparatory phase.
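The interplay of the two judgments can be sketched in a few lines of Python. This is only an illustration restricted to the propositional fragment (atoms and implication, no quantifiers); the tuple encoding of formulas and the `fuel` depth bound are assumptions of the sketch, not part of the deduction system.

```python
# Formulas: an atom is a string; the implication A1 ⊃ A2 is ("imp", A1, A2).
# `fuel` is a hypothetical depth bound that keeps the search finite.

def prove(program, goal, fuel=20):
    """Uniform provability: decompose the goal, then focus on a clause."""
    if fuel == 0:
        return False
    if isinstance(goal, tuple):
        # Right rule for implication: assume A1, then prove A2.
        _, a1, a2 = goal
        return prove(program + [a1], a2, fuel - 1)
    # Rule u_atm: the goal is atomic, pick a program formula to focus on.
    return any(entails(program, clause, goal, fuel - 1) for clause in program)

def entails(program, clause, atom, fuel):
    """Immediate entailment: decompose the focused program formula."""
    if fuel == 0:
        return False
    if isinstance(clause, tuple):
        # Rule i_imp on A2 ⊃ A1: A1 must entail the atom and A2 must be provable.
        _, a2, a1 = clause
        return entails(program, a1, atom, fuel - 1) and prove(program, a2, fuel - 1)
    # A focused atom succeeds only if it is the goal itself.
    return clause == atom
```

For instance, `prove(["q", ("imp", "q", "p")], "p")` succeeds by focusing on the second formula and then proving `q` from the first.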



2.2 Target Language 

In (Cervesato 1998), the target language of the compilation process distinguished compiled
goals ($G$) from compiled clauses ($C$). A compiled goal was either an atomic proposition,
or a hypothetical goal (a goal to be solved in the presence of an additional clause) or a
universal goal (a goal to be solved in the presence of a new constant). A compiled clause
had the form $\Lambda\alpha.\,C$, where the second-order variable $\alpha$ stood for the
atomic goal to be resolved against the present clause, while $C$ could either match
$\alpha$ with the head $a$ of this clause ($a = \alpha$), invoke a goal ($C \wedge G$), or
request that a variable $x$ be instantiated with a term ($\exists x.\,C$). A compiled
program $\Psi$ was then a sequence of compiled clauses. The grammar for the resulting
language, which we call $\mathcal{L}^c_0$, is as follows:

$$
\begin{array}{llcl}
\text{Goals:} & G & ::= & a \mid (\Lambda\alpha.\,C) \supset G \mid \forall x.\, G \\
\text{Clauses:} & C & ::= & a = \alpha \mid C \wedge G \mid \exists x.\, C \\
\text{Programs:} & \Psi & ::= & \cdot \mid \Psi, \Lambda\alpha.\,C
\end{array}
$$

The operational semantics of a compiled program, as given by the above grammar, is 
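Goals

$$
\dfrac{\Psi, \Lambda\alpha.\,C, \Psi' \longrightarrow [a/\alpha]C}{\Psi, \Lambda\alpha.\,C, \Psi' \longrightarrow a}\;\mathsf{g0\_atm}
\qquad
\dfrac{\Psi, \Lambda\alpha.\,C \longrightarrow G}{\Psi \longrightarrow (\Lambda\alpha.\,C) \supset G}\;\mathsf{g0\_imp}
\qquad
\dfrac{c \text{ new} \quad \Psi \longrightarrow [c/x]G}{\Psi \longrightarrow \forall x.\, G}\;\mathsf{g0\_all}
$$

Clause instances

$$
\dfrac{}{\Psi \longrightarrow a = a}\;\mathsf{r0\_eq}
\qquad
\dfrac{\Psi \longrightarrow C \quad \Psi \longrightarrow G}{\Psi \longrightarrow C \wedge G}\;\mathsf{r0\_and}
\qquad
\dfrac{\Psi \longrightarrow [t/x]C}{\Psi \longrightarrow \exists x.\, C}\;\mathsf{r0\_exists}
$$

Fig. 2. Search Semantics of $\mathcal{L}^c_0$.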



defined on the basis of the following two judgments:

$$
\begin{array}{ll}
\Psi \longrightarrow G & \text{$G$ is uniformly provable from $\Psi$} \\
\Psi \longrightarrow C & \text{clause instance $C$ is uniformly provable from $\Psi$}
\end{array}
$$

Here, clause instances are $C$'s whose variable $\alpha$ has been instantiated with an
atomic formula $a'$. The operational semantics of $\mathcal{L}^c_0$ is shown in Figure 2.
Observe that, with the partial exception of g0_atm, it consists solely of right rules.
This means that every connective is seen as a search directive: the dynamic clause
preparation embodied by the left rules has now been turned into right search rules through
a static compilation phase.



2.3 Compilation 

Compilation, the process that transforms a logic program in $\mathcal{L}^s$ into a
compiled program in $\mathcal{L}^c_0$, is expressed by means of the following three
judgments:

$$
\begin{array}{ll}
\Gamma \ggg \Psi & \text{Program $\Gamma$ is compiled to $\Psi$} \\
A \ggg \alpha \setminus C & \text{Clause $A$ with $\alpha$ is compiled to $C$} \\
A \ggg G & \text{Goal $A$ is compiled to $G$}
\end{array}
$$

These judgments are defined by the rules in Figure 3; see (Cervesato 1998) for details.

As our ongoing example, consider the following two clauses, taken from a type checking
specification for a Church-style simply typed λ-calculus. For clarity, we write program
clauses Prolog-style, using the reverse implication $\subset$ instead of $\supset$ in
positive formulas. Each source clause (left) is shown next to its compiled form (right).

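$$
\begin{array}{lll}
1. & \forall E_1.\,\forall E_2.\,\forall T_1.\,\forall T_2. & \Lambda\alpha.\ \exists E_1.\,\exists E_2.\,\exists T_1.\,\exists T_2. \\
 & \quad \mathsf{of}\ (\mathsf{app}\ E_1\ E_2)\ T_2 & \quad (\mathsf{of}\ (\mathsf{app}\ E_1\ E_2)\ T_2) = \alpha \\
 & \quad \subset \mathsf{of}\ E_1\ (\mathsf{arr}\ T_1\ T_2) & \quad \wedge\ \mathsf{of}\ E_1\ (\mathsf{arr}\ T_1\ T_2) \\
 & \quad \subset \mathsf{of}\ E_2\ T_1 & \quad \wedge\ \mathsf{of}\ E_2\ T_1 \\[1ex]
2. & \forall E.\,\forall T_1.\,\forall T_2. & \Lambda\alpha.\ \exists E.\,\exists T_1.\,\exists T_2. \\
 & \quad \mathsf{of}\ (\mathsf{lam}\ T_1\ E)\ (\mathsf{arr}\ T_1\ T_2) & \quad (\mathsf{of}\ (\mathsf{lam}\ T_1\ E)\ (\mathsf{arr}\ T_1\ T_2)) = \alpha \\
 & \quad \subset (\forall x.\ \mathsf{of}\ x\ T_1 & \quad \wedge\ (\forall x.\ (\Lambda\beta.\ (\mathsf{of}\ x\ T_1) = \beta) \\
 & \qquad\quad \supset \mathsf{of}\ (E\ x)\ T_2) & \qquad\quad \supset \mathsf{of}\ (E\ x)\ T_2)
\end{array}
$$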

The compiled language $\mathcal{L}^c_0$ is sound and complete for $\mathcal{L}^s$. See
(Cervesato 1998) for the formal statements. The proof of both directions proceeds by
straightforward induction, which contrasts greatly with the complex proofs of soundness
and correctness previously devised for the WAM (Börger and Rosenzweig 1995;
Russinoff 1992).
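Programs

$$
\dfrac{}{\cdot \ggg \cdot}\;\mathsf{p0c\_empty}
\qquad
\dfrac{\Gamma \ggg \Psi \quad A \ggg \alpha \setminus C}{\Gamma, A \ggg \Psi, \Lambda\alpha.\,C}\;\mathsf{p0c\_clause}
$$

Clauses

$$
\dfrac{}{a \ggg \alpha \setminus a = \alpha}\;\mathsf{c0c\_atm}
\qquad
\dfrac{B \ggg \alpha \setminus C \quad A \ggg G}{A \supset B \ggg \alpha \setminus C \wedge G}\;\mathsf{c0c\_imp}
\qquad
\dfrac{A \ggg \alpha \setminus C}{\forall x.\, A \ggg \alpha \setminus \exists x.\, C}\;\mathsf{c0c\_all}
$$

Goals

$$
\dfrac{}{a \ggg a}\;\mathsf{g0c\_atm}
\qquad
\dfrac{A \ggg \alpha \setminus C \quad B \ggg G}{A \supset B \ggg (\Lambda\alpha.\,C) \supset G}\;\mathsf{g0c\_imp}
\qquad
\dfrac{A \ggg G}{\forall x.\, A \ggg \forall x.\, G}\;\mathsf{g0c\_all}
$$

Fig. 3. Compilation of $\mathcal{L}^s$ into $\mathcal{L}^c_0$.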



3 Fully Logical Compilation 

Because clauses are compiled to expressions of the form $\Lambda\alpha.\,C$, the language
$\mathcal{L}^c_0$ is not fully logical. In this section we consider a different
compilation target, the language $\mathcal{L}^c_1$, which lies entirely within logic.

In the previous section, a generic Horn clause of the form

$$\forall\vec{y}.\ (p\,\vec{t} \subset a_1 \subset \ldots \subset a_n) \qquad (1)$$

was compiled into $\Lambda\alpha.\ \exists\vec{y}.\ (p\,\vec{t} = \alpha \wedge a_1 \wedge \ldots \wedge a_n)$.
During execution, rule g0_atm reduced the current atomic goal $a$ to the clause instance
$\exists\vec{y}.\ (p\,\vec{t} = a \wedge a_1 \wedge \ldots \wedge a_n)$. Note that
$\vec{t}$ may depend on $\vec{y}$, but $a$ does not. We will now compile that Horn clause
into

$$\forall\vec{x}.\ (p\,\vec{x} \subset \exists\vec{y}.\ (\vec{x} = \vec{t} \wedge a_1 \wedge \ldots \wedge a_n)) \qquad (2)$$

where $\vec{x}$ is a sequence of fresh variables, all distinct from each other, and equal
in number to the arity of $p$, and $\vec{x} = \vec{t}$ stands for a conjunction of
equalities between each variable $x_i$ in $\vec{x}$ and the term $t_i$ in $\vec{t}$ in the
corresponding position (or $\top$ if the arity of $p$ is zero). Notice that the
non-logical second-order binder "$\Lambda\alpha.$" is gone. At run time, formula (2) will
resolve an atomic goal $p\,\vec{t}'$ into the clause
$p\,\vec{t}' \subset \exists\vec{y}.\ (\vec{t}' = \vec{t} \wedge a_1 \wedge \ldots \wedge a_n)$,
which immediately reduces to
$\exists\vec{y}.\ (\vec{t}' = \vec{t} \wedge a_1 \wedge \ldots \wedge a_n)$. Like earlier,
$\vec{t}$ may depend on $\vec{y}$, but $\vec{t}'$ does not. The variables $\vec{x}$
correspond directly to the "argument registers" (An) of the WAM (Aït-Kaci 1991), while the
$\vec{y}$'s are closely related to its "permanent variables" (Yn).

Formula (2) can be understood as an uncurried form of (1): outer implications are
transformed into conjunctions and universals into existentials. Doing so literally would
yield the formula $p\,\vec{t} \subset \exists\vec{y}.\ (a_1 \wedge \ldots \wedge a_n)$,
which is incorrect because occurrences of variables in $\vec{y}$ within $\vec{t}$ have
escaped their scope. Instead, formula (2) installs fresh variables $\vec{x}$ as the
arguments of the head predicate $p$ and adds the equality constraints
$\vec{x} = \vec{t}$ in the body.
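As a rough illustration of the step from (1) to (2), the Python sketch below builds the fresh head variables and the equality constraints for a Horn clause. The function name `compile_horn`, the dictionary layout, the tuple encoding of terms and the `x1, x2, ...` naming scheme are all assumptions made for this example.

```python
def compile_horn(pred, head_args, body, bound_vars):
    """Reshape  forall ys. (pred head_args <- body)  along the lines of formula (2).

    Terms are strings (variables, constants) or tuples (functor, arg, ...).
    """
    # One fresh variable per argument position of the head predicate.
    xs = [f"x{i + 1}" for i in range(len(head_args))]
    clash = set(xs) & set(bound_vars)
    if clash:
        raise ValueError(f"fresh names clash with clause variables: {sorted(clash)}")
    # Equality constraints x_i = t_i, position by position.
    equalities = [("=", x, t) for x, t in zip(xs, head_args)]
    return {
        "forall": xs,                 # outer universal quantifiers
        "head": (pred, *xs),          # head applied to distinct variables only
        "exists": list(bound_vars),   # former universals, now existential
        "residual": equalities + list(body),
    }
```

Applied to the first example clause, with head `of (app E1 E2) T2`, this yields the head `of x1 x2` and a residual that starts with `x1 = app E1 E2` and `x2 = T2`, followed by the two body goals.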



3.1 Target Language 

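We now generalize the above intuition to any formula in $\mathcal{L}^s$, not just Horn
clauses. Our second target language, $\mathcal{L}^c_1$, is given by the following grammar:

$$
\begin{array}{llcl}
\text{Goals:} & G & ::= & a \mid C \supset G \mid \forall x.\, G \\
\text{Clauses:} & C & ::= & R \supset p\,\vec{x} \mid \forall x.\, C \\
\text{Residuals:} & R & ::= & x = t \mid \top \mid R \wedge G \mid \exists x.\, R \\
\text{Programs:} & \Psi & ::= & \cdot \mid \Psi, C
\end{array}
$$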
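Goals

$$
\dfrac{\Psi, C, \Psi' \longrightarrow C \gg a}{\Psi, C, \Psi' \longrightarrow a}\;\mathsf{g1\_atm}
\qquad
\dfrac{\Psi, C \longrightarrow G}{\Psi \longrightarrow C \supset G}\;\mathsf{g1\_imp}
\qquad
\dfrac{c \text{ new} \quad \Psi \longrightarrow [c/x]G}{\Psi \longrightarrow \forall x.\, G}\;\mathsf{g1\_all}
$$

Clauses

$$
\dfrac{\Psi \longrightarrow R}{\Psi \longrightarrow R \supset a \gg a}\;\mathsf{c1\_imp}
\qquad
\dfrac{\Psi \longrightarrow [t/x]C \gg a}{\Psi \longrightarrow \forall x.\, C \gg a}\;\mathsf{c1\_all}
$$

Residuals

$$
\dfrac{}{\Psi \longrightarrow t = t}\;\mathsf{r1\_eq}
\qquad
\dfrac{}{\Psi \longrightarrow \top}\;\mathsf{r1\_true}
\qquad
\dfrac{\Psi \longrightarrow R \quad \Psi \longrightarrow G}{\Psi \longrightarrow R \wedge G}\;\mathsf{r1\_and}
\qquad
\dfrac{\Psi \longrightarrow [t/x]R}{\Psi \longrightarrow \exists x.\, R}\;\mathsf{r1\_exists}
$$

Fig. 4. Search Semantics of $\mathcal{L}^c_1$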



Compiled goals ($G$) are just like in Section 2.2: atoms, hypothetical goals, or universal
goals. Compiled clauses ($C$) have the form $\forall\vec{x}.\ (R \supset p\,\vec{x})$,
i.e., a (possibly empty) outer layer of universal quantifiers enclosing an implication
$R \supset p\,\vec{x}$ whose head $p\,\vec{x}$ always consists of a predicate name ($p$)
applied to a (possibly empty) sequence of distinct variables ($\vec{x}$). Its body is a
residual ($R$). A residual can be either an equality constraint ($x = t$), the trivial
constraint $\top$ (logical truth), or, like in Section 2.2, a goal invocation or an
instantiation request. Notice that $C$ is now the full result of compiling a clause.

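The operational semantics of $\mathcal{L}^c_1$ is specified by the following three
judgments:

$$
\begin{array}{ll}
\Psi \longrightarrow G & \text{$G$ is uniformly provable from $\Psi$} \\
\Psi \longrightarrow C \gg a & \text{$a$ is immediately entailed by $C$ in $\Psi$} \\
\Psi \longrightarrow R & \text{$R$ is uniformly provable from $\Psi$}
\end{array}
$$

where the $C$ and $R$ appearing in these judgments differ from the $C$ and $R$ of the
grammar by the instantiation of some variables in a clause head and on the left-hand side
of equalities, respectively.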

The defining rules are given in Figure 4. Goals are handled exactly in the same way as
uniform provability in $\mathcal{L}^s$ (top part of Figure 1). The operational reading of
compiled clauses is an instance of that of immediate entailment: rule c1_imp is a special
case of i_imp while c1_all is isomorphic to i_all. Note that rule c1_imp reduces
immediately to the residual $R$ if the head of the clause matches the atomic goal $a$
being proved. The rules for residuals correspond closely to the rules for clause instances
for our original target language at the bottom of Figure 2: rule r1_eq requires that the
two sides of an equality be indeed equal and rule r1_true is always satisfied.

The rules in Figure 4 build uniform proofs (Miller et al. 1991), characteristic of
abstract logic programming languages: the operational semantics decomposes a goal to an
atomic formula (top segment of Figure 4), then selects a clause and focuses on it until it
finds a matching head (middle segment) and then decomposes its body (bottom segment),
which may eventually expose some goals, and the cycle repeats. In particular, once an
atomic goal $p\,\vec{t}$ has been exposed, a successful derivation will necessarily
contain an instance of rule g1_atm that picks a clause $C$ with head $p\,\vec{x}$, as many
instances of rule c1_all as the arity of $p$, and an instance of rule c1_imp. This
necessary sequence of steps is captured by the following derived "macro-rule" (the
backchaining rule):

$$
\dfrac{\Psi, \forall\vec{x}.\ (R \supset p\,\vec{x}), \Psi' \longrightarrow [\vec{t}/\vec{x}]R}
      {\Psi, \forall\vec{x}.\ (R \supset p\,\vec{x}), \Psi' \longrightarrow p\,\vec{t}}\;\mathsf{g1\_atm'}
$$

Replacing rules g1_atm, c1_all and c1_imp with rule g1_atm' yields a system that is
equivalent to that in Figure 4. Taking it as primitive amounts to replacing the
construction for compiled clauses, $\forall\vec{x}.\ (R \supset p\,\vec{x})$, with a
synthetic connective, call it $\Lambda p\,\vec{x}.\ R$. Therefore, by accounting for the
structure of atomic propositions and proper quantification patterns, $\mathcal{L}^c_1$
provides a fully logical justification for clause compilation that $\mathcal{L}^c_0$'s
$\Lambda\alpha.\,C$ lacked.
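The backchaining macro-rule amounts to a simultaneous substitution of the goal's arguments for the head variables of the selected clause. The following Python sketch illustrates just that step; the clause layout (a dictionary with `forall`, `head` and `residual` entries), the tuple encoding of formulas and the function names are assumptions of the sketch, and it presumes that variable names never coincide with predicate or functor names.

```python
def subst_all(term, mapping):
    """Simultaneously replace variables (strings) according to `mapping`."""
    if isinstance(term, str):
        return mapping.get(term, term)
    return tuple(subst_all(t, mapping) for t in term)

def backchain(clause, goal):
    """One backchaining step: for a clause  forall xs. (R ⊃ p xs)  and a goal
    p ts, return the instantiated residual [ts/xs]R, or None if the head
    predicate or arity does not match the goal."""
    xs, head, residual = clause["forall"], clause["head"], clause["residual"]
    if head[0] != goal[0] or len(head) != len(goal):
        return None
    # The head is p applied to exactly the variables xs, in order.
    return subst_all(residual, dict(zip(xs, goal[1:])))
```

Since the head contains only distinct variables, no unification is needed to select the clause: comparing predicate symbols and arities suffices, and all term-level work is deferred to the equalities in the residual.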

3.2 Compilation 

Compilation transforms logic programs in $\mathcal{L}^s$ into compiled logic programs in
$\mathcal{L}^c_1$. In order to define it, the auxiliary notion of pseudo clause will come
in handy:

$$
\begin{array}{llcl}
\text{Pseudo Clauses:} & \mathcal{C} & ::= & \Box \supset p\,\vec{x} \mid \forall x.\, \mathcal{C}
\end{array}
$$

A pseudo clause retains the outer structure of a clause, but has a hole ($\Box$) in place
of the residual $R$. In general, a pseudo clause $\mathcal{C}$ has the form
$\forall\vec{x}.\ \Box \supset p\,\vec{x}'$. In a fully compiled clause, variables
$\vec{x}$ will coincide with $\vec{x}'$.

Pseudo clauses are generated while processing the head of a clause. The hole then needs
to be replaced with the compiled body, a residual. We write this operation, pseudo clause
instantiation, as $\mathcal{C}[R]$. It is formally defined as follows:

$$
\begin{array}{rcl}
(\Box \supset p\,\vec{x})[R] & = & R \supset p\,\vec{x} \\
(\forall x.\, \mathcal{C})[R] & = & \forall x.\, (\mathcal{C}[R])
\end{array}
$$

As is often the case with such contextual operations, pseudo clause instantiation can, and
generally will, lead to variable capture: in
$(\forall\vec{x}.\ \Box \supset p\,\vec{x})[R]$, there may be free occurrences of
variables in $\vec{x}$ within $R$. In the result, these occurrences are bound by the outer
quantifiers.
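The two defining equations of pseudo clause instantiation translate directly into a recursive function. In this Python sketch the tags `"forall"`, `"hole_imp"` and `"imp"` and the tuple encoding are assumptions chosen for illustration.

```python
def instantiate(pseudo, residual):
    """Fill the hole of a pseudo clause with a residual.

    A pseudo clause is ("hole_imp", head), standing for  □ ⊃ head,
    or ("forall", x, pseudo'), standing for  ∀x. pseudo'.
    """
    tag = pseudo[0]
    if tag == "hole_imp":
        _, head = pseudo
        return ("imp", residual, head)
    if tag == "forall":
        _, x, inner = pseudo
        # No renaming takes place: free occurrences of x in the residual
        # are deliberately captured by this quantifier.
        return ("forall", x, instantiate(inner, residual))
    raise ValueError(f"not a pseudo clause: {pseudo!r}")
```

The absence of any renaming in the quantifier case is what realizes the intended variable capture described above.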

Compilation is expressed by means of the following four judgments

$$
\begin{array}{ll}
\Gamma \ggg \Psi & \text{Program $\Gamma$ is compiled to $\Psi$} \\
\vec{x} \vdash a \ggg \mathcal{C} \setminus E & \text{Head $a$ with $\vec{x}$ is compiled to $\mathcal{C}$ and $E$} \\
A \ggg \mathcal{C} \setminus R & \text{Clause $A$ is compiled to $\mathcal{C}$ and $R$} \\
A \ggg G & \text{Goal $A$ is compiled to $G$}
\end{array}
$$

and defined by the rules in Figure 5, where we wrote $E$ for conjunctions of equalities.
The judgment $A \ggg \mathcal{C} \setminus R$ compiles an $\mathcal{L}^s$ clause $A$ into
a pseudo clause $\mathcal{C}$ and a residual $R$. They are assembled into an
$\mathcal{L}^c_1$ clause in rules p1c_clause and g1c_imp. Programs and goals are otherwise
compiled just as for $\mathcal{L}^c_0$ in Figure 3. Clause heads are handled differently:
rule c1c_atm invokes the auxiliary head compilation judgment to compile the head
$p\,\vec{t}$ into a pseudo clause $\forall\vec{x}.\ \Box \supset p\,\vec{x}$ and the
equalities $\vec{x} = \vec{t}$, which will form the seed of the clause's residual.

Consider the first example clause in Section 2.3. Its head
($\mathsf{of}\ (\mathsf{app}\ E_1\ E_2)\ T_2$) is compiled into the pseudo clause
$\forall x_1.\,\forall x_2.\ (\Box \supset \mathsf{of}\ x_1\ x_2)$ and the equality
constraints $\top \wedge (x_1 = \mathsf{app}\ E_1\ E_2) \wedge (x_2 = T_2)$, where $x_1$
and $x_2$ are new variables. These core equalities
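Programs

$$
\dfrac{}{\cdot \ggg \cdot}\;\mathsf{p1c\_empty}
\qquad
\dfrac{\Gamma \ggg \Psi \quad A \ggg \mathcal{C} \setminus R}{\Gamma, A \ggg \Psi, \mathcal{C}[R]}\;\mathsf{p1c\_clause}
$$

Heads

$$
\dfrac{}{\vec{x} \vdash p \ggg \Box \supset p\,\vec{x} \setminus \top}
\qquad
\dfrac{x\,\vec{x} \vdash a \ggg \mathcal{C} \setminus E \quad x \text{ new}}{\vec{x} \vdash a\ t \ggg \forall x.\, \mathcal{C} \setminus E \wedge x = t}
$$

Clauses

$$
\dfrac{\cdot \vdash a \ggg \mathcal{C} \setminus E}{a \ggg \mathcal{C} \setminus E}\;\mathsf{c1c\_atm}
\qquad
\dfrac{A \ggg G \quad B \ggg \mathcal{C} \setminus R}{A \supset B \ggg \mathcal{C} \setminus R \wedge G}\;\mathsf{c1c\_imp}
\qquad
\dfrac{A \ggg \mathcal{C} \setminus R}{\forall x.\, A \ggg \mathcal{C} \setminus \exists x.\, R}\;\mathsf{c1c\_all}
$$

Goals

$$
\dfrac{}{a \ggg a}\;\mathsf{g1c\_atm}
\qquad
\dfrac{A \ggg \mathcal{C} \setminus R \quad B \ggg G}{A \supset B \ggg \mathcal{C}[R] \supset G}\;\mathsf{g1c\_imp}
\qquad
\dfrac{A \ggg G}{\forall x.\, A \ggg \forall x.\, G}\;\mathsf{g1c\_all}
$$

Fig. 5. Compilation of $\mathcal{L}^s$ into $\mathcal{L}^c_1$.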



are then extended with the compiled body of that clause,
$(\mathsf{of}\ E_1\ (\mathsf{arr}\ T_1\ T_2)) \wedge (\mathsf{of}\ E_2\ T_1)$, and
existential quantifications over the original variables of the clause, $E_1$, $E_2$,
$T_1$ and $T_2$, are finally wrapped around the result before embedding it in the hole of
the pseudo clause. The resulting $\mathcal{L}^c_1$ clause is displayed in the top part of
Figure 6.

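The target language $\mathcal{L}^c_1$ is sound and complete with respect to
$\mathcal{L}^s$. In order to show it, we need the following auxiliary results. The first
statement is proved by induction on the structure of $a$. The second by induction on the
given derivation.

Lemma 3.1

• If $\vec{x} \vdash a \ggg \mathcal{C} \setminus E$, then for all $\vec{t}$ of the same
  length as $\vec{x}$ and all $\Psi$ we have
  $\Psi \longrightarrow [\vec{t}/\vec{x}](\mathcal{C}[E]) \gg a\,\vec{t}$.

• If $\Psi \longrightarrow \mathcal{C}[R] \gg a$, then $\Psi \longrightarrow R$.

The statements of soundness and completeness are as follows. For each of them, the
proof proceeds by mutual induction on the first derivation in the antecedent.

Theorem 3.2 (Soundness of the compilation to $\mathcal{L}^c_1$)

• If $\Gamma \longrightarrow A$, $\Gamma \ggg \Psi$ and $A \ggg G$, then
  $\Psi \longrightarrow G$.

• If $\Gamma \longrightarrow A \gg a$, $\Gamma \ggg \Psi$ and
  $A \ggg \mathcal{C} \setminus R$, then $\Psi \longrightarrow \mathcal{C}[R] \gg a$.

Theorem 3.3 (Completeness of the compilation to $\mathcal{L}^c_1$)

• If $\Psi \longrightarrow G$, $\Gamma \ggg \Psi$ and $A \ggg G$, then
  $\Gamma \longrightarrow A$.

• If $\Psi \longrightarrow C \gg a$, $\Gamma \ggg \Psi$, $C = \mathcal{C}[R]$ and
  $A \ggg \mathcal{C} \setminus R$, then $\Gamma \longrightarrow A \gg a$.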

We conclude this section by showing in Figure 6 the output of our compilation procedure
for the two examples seen in Section 2.3. We stretch the source clauses (left) to align
corresponding atoms. As can be gleaned from these clauses, there are ample opportunities
for optimizations in our compilation process. In particular, a constraint $x = y$
mentioning variables on both sides can often be eliminated by replacing the existential
variable $y$ with the universal variable $x$ in the rest of the clause (and removing the
existential quantifier); the exception is when there are multiple constraints of this form
for the same $y$. The leading logical constant $\top$ makes for a succinct presentation of
the compilation process, but plays no actual role: it can also be eliminated.
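The variable-to-variable optimization just described can be sketched as a small post-processing pass. In this Python illustration a residual is a flat list of conjuncts, equalities are encoded as `("=", x, t)`, and both function names are hypothetical. The sketch conservatively skips any existential variable that appears on the right of more than one such equality, in line with the exception noted above.

```python
def subst(term, old, new):
    """Replace every occurrence of variable `old` by `new` in a nested tuple."""
    if term == old:
        return new
    if isinstance(term, tuple):
        return tuple(subst(t, old, new) for t in term)
    return term

def eliminate_var_equalities(exists, residual):
    """Drop constraints x = y where y is an existential variable constrained
    by exactly one such equality, replacing y by x everywhere else."""
    exists, residual = list(exists), list(residual)
    for y in list(exists):
        hits = [c for c in residual if c[0] == "=" and c[2] == y]
        if len(hits) != 1:
            continue  # no constraint for y, or several: leave it alone
        x = hits[0][1]
        residual.remove(hits[0])
        residual = [subst(c, y, x) for c in residual]
        exists.remove(y)
    return exists, residual
```

On the first example clause this removes `x2 = T2` together with the quantifier for `T2`, and rewrites the goal `of E1 (arr T1 T2)` to `of E1 (arr T1 x2)`.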
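$$
\begin{array}{lll}
1. & \forall E_1.\,\forall E_2.\,\forall T_1.\,\forall T_2. & \forall x_1.\,\forall x_2. \\
 & \quad \mathsf{of}\ (\mathsf{app}\ E_1\ E_2)\ T_2 & \quad \mathsf{of}\ x_1\ x_2 \\
 & & \quad \subset (\exists E_1.\,\exists E_2.\,\exists T_1.\,\exists T_2.\ \top \\
 & & \qquad\quad \wedge\ x_1 = \mathsf{app}\ E_1\ E_2 \\
 & & \qquad\quad \wedge\ x_2 = T_2 \\
 & \quad \subset \mathsf{of}\ E_1\ (\mathsf{arr}\ T_1\ T_2) & \qquad\quad \wedge\ \mathsf{of}\ E_1\ (\mathsf{arr}\ T_1\ T_2) \\
 & \quad \subset \mathsf{of}\ E_2\ T_1 & \qquad\quad \wedge\ \mathsf{of}\ E_2\ T_1) \\[1ex]
2. & \forall E.\,\forall T_1.\,\forall T_2. & \forall x_1.\,\forall x_2. \\
 & \quad \mathsf{of}\ (\mathsf{lam}\ T_1\ E)\ (\mathsf{arr}\ T_1\ T_2) & \quad \mathsf{of}\ x_1\ x_2 \\
 & & \quad \subset (\exists E.\,\exists T_1.\,\exists T_2.\ \top \\
 & & \qquad\quad \wedge\ x_1 = \mathsf{lam}\ T_1\ E \\
 & & \qquad\quad \wedge\ x_2 = \mathsf{arr}\ T_1\ T_2 \\
 & \quad \subset (\forall x. & \qquad\quad \wedge\ (\forall x.\ (\forall x'_1.\,\forall x'_2.\ \top \\
 & & \qquad\qquad\qquad \wedge\ x'_1 = x \\
 & & \qquad\qquad\qquad \wedge\ x'_2 = T_1 \\
 & \qquad\quad \mathsf{of}\ x\ T_1 & \qquad\qquad\qquad \supset \mathsf{of}\ x'_1\ x'_2) \\
 & \qquad\quad \supset \mathsf{of}\ (E\ x)\ T_2) & \qquad\qquad \supset \mathsf{of}\ (E\ x)\ T_2))
\end{array}
$$

Fig. 6. Compilation Example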



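It is interesting to rewrite these clauses using the synthetic connective $\Lambda p$
discussed earlier (we have omitted occurrences of $\top$ for readability):

$$
\begin{array}{l}
\Lambda\,\mathsf{of}\ x_1\ x_2.\ \exists E_1.\,\exists E_2.\,\exists T_1.\,\exists T_2. \\
\qquad x_1 = \mathsf{app}\ E_1\ E_2 \ \wedge\ x_2 = T_2 \\
\qquad \wedge\ \mathsf{of}\ E_1\ (\mathsf{arr}\ T_1\ T_2) \ \wedge\ \mathsf{of}\ E_2\ T_1 \\[1ex]
\Lambda\,\mathsf{of}\ x_1\ x_2.\ \exists E.\,\exists T_1.\,\exists T_2. \\
\qquad x_1 = \mathsf{lam}\ T_1\ E \ \wedge\ x_2 = \mathsf{arr}\ T_1\ T_2 \\
\qquad \wedge\ \forall x.\ (\Lambda\,\mathsf{of}\ x'_1\ x'_2.\ x'_1 = x \wedge x'_2 = T_1) \supset \mathsf{of}\ (E\ x)\ T_2
\end{array}
$$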



4 Support for Moded Programs 

In this section, we will specialize the compilation process just outlined to the case
where the source program is well-moded. In a well-moded program, the argument positions of
each predicate symbol are designated as either input or output. Input arguments are
guaranteed to be ground terms at the time a goal is called. Dually, output arguments are
guaranteed to have been made ground by the time the call returns.

There are operational benefits to working with well-moded programs: while an interpreter
for a generic program must implement term-level unification, well-moded programs can be
executed by relying uniquely on pattern matching and variable instantiation. This is
desirable because matching often behaves better than general unification. For example, it
is more efficient for first-order term languages, if only because it does away with the
occurs-check, and it is decidable for higher-order term languages while general
unification is not (Stirling 2009).
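To make the contrast with unification concrete, here is a minimal Python sketch of one-way matching for first-order terms. The encoding is an assumption of the example: variables are strings starting with an uppercase letter, constants are other strings, and compound terms are tuples headed by their functor. Because the second argument is ground, bindings never contain variables and no occurs-check is required.

```python
def is_var(t):
    """Hypothetical convention: variables are capitalized strings."""
    return isinstance(t, str) and t[:1].isupper()

def match(pattern, ground, bindings=None):
    """Bind the variables of `pattern` so that it equals the ground term
    `ground`. Returns the bindings as a dict, or None on failure."""
    bindings = dict(bindings or {})
    if is_var(pattern):
        if pattern in bindings:
            # A repeated variable must be matched against the same subterm.
            return bindings if bindings[pattern] == ground else None
        bindings[pattern] = ground
        return bindings
    if isinstance(pattern, tuple) and isinstance(ground, tuple):
        if len(pattern) != len(ground):
            return None
        for p, g in zip(pattern, ground):
            bindings = match(p, g, bindings)
            if bindings is None:
                return None
        return bindings
    # Constants and functor names must coincide exactly.
    return bindings if pattern == ground else None
```

For example, matching `("app", "E1", "E2")` against `("app", ("lam", "a", "b"), "c")` binds `E1` to `("lam", "a", "b")` and `E2` to `"c"`.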



The development in this section is motivated by well-moding, but is sound independently
of whether a program is well-moded or not. Statically enforcing well-moding brings the
operational advantages just discussed, but the results in this section do not depend on it.



4.1 Source Language 

In this section, we assume that each predicate symbol in $\mathcal{L}^s$ comes with a mode
which declares each of its arguments as input, written $\hat{\ }$, or output, written
$\check{\ }$. For simplicity of exposition, we decorate the actual arguments of all atomic
propositions with these symbols, so that a term $t$ in input position in an atomic
proposition is written $\hat{t}$ (read "in t"). Similarly $t$ in output position is
written $\check{t}$ (pronounced "out t"). This amounts to revising the grammar of atomic
propositions as follows:

$$
\begin{array}{llcl}
\text{Atoms:} & a & ::= & p \mid a\ \hat{t} \mid a\ \check{t}
\end{array}
$$

Just like we assume that the arity of a predicate symbol $p$ remains constant in a
program, we require that all atomic propositions for $p$ have their input/output marks in
the same positions. This pattern is the mode of $p$; an actual language would rely on
explicit mode declarations.

For typographic convenience and without loss of generality, our examples assume that
input positions precede output positions so that an atomic formula $a$ can be written as
$p\,\hat{\vec{t}}\,\check{\vec{t}}$ where $\hat{\vec{t}}$ and $\check{\vec{t}}$ are the
(possibly empty) sequences of terms in input (resp. output) positions for $p$. To avoid
notational proliferation, we use the markers $\hat{\ }$ and $\check{\ }$ both as mode
designators and as symbol decorations (like primes and subscripts) when working with
generic terms. Therefore, $\hat{t}$ and $\check{t}$ indicate possibly different terms in
$p\,\hat{t}\,\check{t}$, and similarly for term sequences, as in
$p\,\hat{\vec{t}}\,\check{\vec{t}}$ above.

At our level of abstraction, the rules in Figure 1 capture the operational semantics of
this variant of $\mathcal{L}^s$: mode annotations are simply ignored. However, moded
execution requires that two of the operational choices left open by those rules be
resolved using some algorithmic strategy: the order in which rule i_imp searches for
derivations of its two premises, and the substitution term that rule i_all picks. For
both, we will assume the same strategy as Prolog: implement rule i_imp left to right and
implement rule i_all lazily by replacing each variable $x$ with a "logical variable" $X$
which is instantiated incrementally through unification. This allows us to view an atomic
goal as a (non-deterministic) procedure call. In a well-moded program (Debray and
Warren 1988), terms in input position are seen as the actual arguments of this procedure,
and terms in output position yield return values.

In this section, we will not formalize the notion of well-modedness (see (Debray and
Warren 1988) for Prolog and (Sarnat 2010) for Twelf) nor refine our operational semantics
to make goal evaluation order and unification explicit (see (Pientka 2003)). We will
instead refine our compilation process to account for mode information and produce
compiled programs that, if well-moded, can be executed without appealing to unification.



4.2 Target Language 

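In $\mathcal{L}^c_1$, a (well-moded) Horn clause
$\forall\vec{y}.\ p\,\hat{\vec{t}}\,\check{\vec{t}} \subset a_1 \subset \ldots \subset a_n$
was compiled into
$\forall\hat{\vec{x}}\,\check{\vec{x}}.\ (p\,\hat{\vec{x}}\,\check{\vec{x}} \subset \exists\vec{y}.\ (\hat{\vec{x}} = \hat{\vec{t}} \wedge \check{\vec{x}} = \check{\vec{t}} \wedge a_1 \wedge \ldots \wedge a_n))$.
Here, the left-to-right execution order forces us to guess the final values of the output
variables $\check{\vec{x}}$ before the goals in its body have been fully executed. In
$\mathcal{L}^c_2$ we will move the equality $\check{\vec{x}} = \check{\vec{t}}$ after the
last goal $a_n$. Since $\check{\vec{x}}$ appears nowhere else in the residual, this
equality is no more than an assignment of the computed instance of $\check{\vec{t}}$ to
$\check{\vec{x}}$. Accordingly, we will write it as $\check{\vec{x}} := \check{\vec{t}}$.
Furthermore, in a well-moded program, this clause will be invoked with ground terms in
input position, so that $\hat{\vec{x}}$ will be bound to ground terms. Then, the input
equality $\hat{\vec{x}} = \hat{\vec{t}}$ will match the variables in $\hat{\vec{t}}$ with
appropriate subterms. For this reason, we will write it as
$\hat{\vec{x}} =: \hat{\vec{t}}$. Expanding each goal $a_i$ into
$q_i\,\hat{\vec{t}}_i\,\check{\vec{t}}_i$, the above clause will be compiled (almost) as
follows. In a well-moded execution, data flows from left to right through this formula,
from the inputs $\hat{\vec{x}}$ through each goal in turn to the outputs
$\check{\vec{x}}$, which parallels the control flow:

$$
\forall\hat{\vec{x}}\,\check{\vec{x}}.\ (p\,\hat{\vec{x}}\,\check{\vec{x}} \subset (\exists\vec{y}.\ \hat{\vec{x}} =: \hat{\vec{t}} \ \wedge\ q_1\,\hat{\vec{t}}_1\,\check{\vec{t}}_1 \ \wedge\ \ldots\ \wedge\ q_n\,\hat{\vec{t}}_n\,\check{\vec{t}}_n \ \wedge\ \check{\vec{x}} := \check{\vec{t}}))
$$

When executing an atomic goal, it is desirable to separate the call from the verification
that the output terms returned by the call match the expected output terms in this goal.
We will do so by rewriting any atomic goal $q\,\hat{\vec{t}}\,\check{\vec{t}}$ in a
compiled clause into the formula
$\exists\vec{z}.\ (q\,\hat{\vec{t}}\,\vec{z} \wedge \vec{z} =: \check{\vec{t}})$ for fresh
variables $\vec{z}$. This transformation preserves the left-to-right control and data
flow. No special provision needs to be made for the input arguments of $q$ as variables in
it will have been instantiated to ground terms at the moment the call is made.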
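The rewriting of a moded atomic goal into a call followed by output matches can be sketched as follows. This Python fragment is only an illustration: the function names, the `z1, z2, ...` naming of fresh variables, the dictionary layout and the assumption that input arguments precede output arguments are choices made for the example.

```python
import itertools

def fresh_names(prefix="z"):
    """Generator of hypothetical fresh variable names z1, z2, ..."""
    return (f"{prefix}{i}" for i in itertools.count(1))

def split_goal(pred, inputs, outputs, fresh):
    """Rewrite the goal  pred inputs outputs  into the shape
    exists zs. (pred inputs zs  AND  zs =: outputs)."""
    # One fresh variable per output position receives the returned value.
    zs = [next(fresh) for _ in outputs]
    call = (pred, *inputs, *zs)
    # Each returned value is then matched against the expected output term.
    matches = [("=:", z, t) for z, t in zip(zs, outputs)]
    return {"exists": zs, "call": call, "matches": matches}
```

Taking, purely for the sake of the example, the first argument of `of` as input and the second as output, the goal `of E1 (arr T1 T2)` becomes the call `of E1 z1` followed by the match `z1 =: arr T1 T2`.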

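Next, we again generalize this intuition to any formula in $\mathcal{L}^s$, not just Horn
clauses. Our third target language, $\mathcal{L}^c_2$, is defined by the following
grammar.

$$
\begin{array}{llcl}
\text{Goal Matches:} & M & ::= & \top \mid M \wedge z =: \check{t} \\
\text{Atomic Goals:} & F & ::= & p\,\hat{\vec{t}}\,\vec{z} \wedge M \mid \exists z.\, F \\
\text{Goals:} & G & ::= & F \mid C \supset G \mid \forall x.\, G \\
\text{Clauses:} & C & ::= & R \supset p\,\hat{\vec{x}}\,\check{\vec{x}} \mid \forall x.\, C \\
\text{Residuals:} & R & ::= & \hat{x} =: \hat{t} \mid \check{x} := \check{t} \mid \top \mid R \wedge G \mid \exists x.\, R \\
\text{Programs:} & \Psi & ::= & \cdot \mid \Psi, C
\end{array}
$$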

Residuals ($R$) refine the equality predicate $x = t$ of $\mathcal{L}^c_1$ into a matching
predicate $\hat{x} =: \hat{t}$ and an assignment predicate $\check{x} := \check{t}$. At
our level of abstraction, they behave just like equality. During well-moded execution, the
match predicate will have the form $t_g =: t$, where $t_g$ is a ground term while $t$ may
contain variables. It will bind these variables to ground subterms of $t_g$, thereby
realizing matching. However, presented with programs that are not well-moded, the terms
$t_g$ cannot be assumed to be ground and $=:$ performs unification. The assignment
predicate will be called as $x := t$ where $x$ is a variable and $t$ a term (a ground term
for well-moded programs). It simply binds $x$ to $t$. Compiled clauses and programs are
just like in $\mathcal{L}^c_1$.

Following the motivations above, an atomic goal $p\,\hat{\vec{t}}\,\check{\vec{t}}$ is not
compiled any more to itself as in $\mathcal{L}^c_1$, but to a formula $F$ of the form
$\exists\vec{z}.\ (p\,\hat{\vec{t}}\,\vec{z} \wedge \vec{z} =: \check{\vec{t}})$. In the
grammar above, we isolated the match predicates $\vec{z} =: \check{\vec{t}}$ as the
non-terminal $M$.
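Goal Matches

$$
\dfrac{}{\longrightarrow \top}\;\mathsf{m2\_true}
\qquad
\dfrac{\longrightarrow M}{\longrightarrow M \wedge t =: t}\;\mathsf{m2\_mtch}
$$

Atomic Goals

$$
\dfrac{\Psi, C, \Psi' \longrightarrow C \gg p\,\hat{\vec{t}}\,\check{\vec{t}} \quad \longrightarrow M}{\Psi, C, \Psi' \longrightarrow p\,\hat{\vec{t}}\,\check{\vec{t}} \wedge M}\;\mathsf{a2\_atm}
\qquad
\dfrac{\Psi \longrightarrow [t/z]F}{\Psi \longrightarrow \exists z.\, F}\;\mathsf{a2\_exists}
$$

Goals

$$
\dfrac{\Psi \longrightarrow F \ \text{(as an atomic goal)}}{\Psi \longrightarrow F \ \text{(as a goal)}}\;\mathsf{g2\_f}
\qquad
\dfrac{\Psi, C \longrightarrow G}{\Psi \longrightarrow C \supset G}\;\mathsf{g2\_imp}
\qquad
\dfrac{c \text{ new} \quad \Psi \longrightarrow [c/x]G}{\Psi \longrightarrow \forall x.\, G}\;\mathsf{g2\_all}
$$

Clauses

$$
\dfrac{\Psi \longrightarrow R}{\Psi \longrightarrow R \supset a \gg a}\;\mathsf{c2\_imp}
\qquad
\dfrac{\Psi \longrightarrow [t/x]C \gg a}{\Psi \longrightarrow \forall x.\, C \gg a}\;\mathsf{c2\_all}
$$

Residuals

$$
\dfrac{}{\Psi \longrightarrow t =: t}\;\mathsf{r2\_mtch}
\qquad
\dfrac{}{\Psi \longrightarrow t := t}\;\mathsf{r2\_assg}
\qquad
\dfrac{}{\Psi \longrightarrow \top}\;\mathsf{r2\_true}
$$

$$
\dfrac{\Psi \longrightarrow R \quad \Psi \longrightarrow G}{\Psi \longrightarrow R \wedge G}\;\mathsf{r2\_and}
\qquad
\dfrac{\Psi \longrightarrow [t/x]R}{\Psi \longrightarrow \exists x.\, R}\;\mathsf{r2\_exists}
$$

Fig. 7. Search Semantics of $\mathcal{L}^c_2$.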



We specify the operational semantics of $\mathcal{L}^c_2$ by means of the following five
judgments:

$$
\begin{array}{ll}
\longrightarrow M & \text{$M$ is provable} \\
\Psi \longrightarrow F & \text{$F$ is uniformly provable from $\Psi$} \\
\Psi \longrightarrow G & \text{$G$ is uniformly provable from $\Psi$} \\
\Psi \longrightarrow C \gg a & \text{$a$ is immediately entailed by $C$ in $\Psi$} \\
\Psi \longrightarrow R & \text{$R$ is uniformly provable from $\Psi$}
\end{array}
$$

which parallel the grammar just presented. The resulting operational semantics is shown
in Figure 7. The rules for clauses are unchanged with respect to $\mathcal{L}^c_1$ while
that language's residual rule for equality has been duplicated into isomorphic rules for
matching and assignment. The rules for compiled goals have instead proliferated due to our
handling of terms in output position in atomic goals. Observe that rule a2_atm is
essentially a combination of rule g1_atm in $\mathcal{L}^c_1$ and the rule for
conjunction. Rules a2_exists and m2_true are just the standard rules for existential
quantification and truth. Rule m2_mtch combines the rules for conjunction and matching.

Just like in the case of $\mathcal{L}^c_1$, the rules in Figure 7 construct proofs that
are uniform (Miller et al. 1991), which makes $\mathcal{L}^c_2$ an abstract logic
programming language. In a successful derivation, this operational semantics decomposes a
goal to formulas of the form
$F = \exists\vec{z}.\ (p\,\hat{\vec{t}}\,\vec{z} \wedge \vec{z} =: \check{\vec{t}})$
(rules in the "Goals" segment). Then, rules a2_exists, m2_mtch and m2_true necessarily
reduce it in a few steps into the atomic formula $p\,\hat{\vec{t}}\,\check{\vec{t}}$.
Similarly to $\mathcal{L}^c_1$, the left premise of rule a2_atm selects a clause and
focuses on it until it finds a potentially matching head ("Clauses" segment). It then
proceeds to decomposing its body ("Residuals" segment) and the cycle repeats with whatever
goals it finds in there.
As just noticed, any atomic goal $F$ of the form
$\exists\vec{z}.\ (p\,\hat{\vec{t}}\,\vec{z} \wedge \vec{z} =: \check{\vec{t}})$ is
necessarily reduced to $p\,\hat{\vec{t}}\,\check{\vec{t}}$ by as many applications of rule
a2_exists as there are variables in $\vec{z}$, a pass-through instance of a2_atm via its
right branch, and a similar number of uses of rules m2_mtch and m2_true respectively. This
entails that the macro-rule a2_atm', on the left-hand side of the following display, is
derivable:

$$
\dfrac{\Psi \longrightarrow p\,\hat{\vec{t}}\,\check{\vec{t}}}{\Psi \longrightarrow \exists\vec{z}.\ (p\,\hat{\vec{t}}\,\vec{z} \wedge \vec{z} =: \check{\vec{t}})}\;\mathsf{a2\_atm'}
\qquad
\dfrac{\Psi, C, \Psi' \longrightarrow C \gg p\,\hat{\vec{t}}\,\check{\vec{t}}}{\Psi, C, \Psi' \longrightarrow p\,\hat{\vec{t}}\,\check{\vec{t}}}\;\mathsf{a2\_atm''}
$$

Having factored rule a2_atm' out, the work performed by a2_atm degenerates to rule
a2_atm'' on the right-hand side of the above display, which is akin to u_atm. The system
obtained by replacing the m2_* and a2_* rules as well as g2_f with rules a2_atm' and
a2_atm'' is indeed equivalent to the rule set in Figure 7.

Rule a2_atm' entices us to interpret the compiled formula
$\exists\vec{z}.\ (p\,\hat{\vec{t}}\,\vec{z} \wedge \vec{z} =: \check{\vec{t}})$ for an
atomic goal $p\,\hat{\vec{t}}\,\check{\vec{t}}$ as a synthetic operator
$\mathsf{call}\ p\,\hat{\vec{t}} =: \check{\vec{t}}$ which invokes a clause for $p$ with
its (ground) input arguments $\hat{\vec{t}}$ and matches the returned values against its
terms $\check{\vec{t}}$ in output position.

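Having recovered atomic goals $p\,\overline{t}\,\underline{t}$ through rules a2_atm' and a2_atm", we can carry out a sequence of reasoning steps similar to what led us to the backchaining rule for $\mathcal{L}^c_1$. Exposing the trailing assignments, a generic compiled clause $C$ has the form $\forall\overline{x}\,\underline{x}.\,(\exists\vec{y}.\,R \wedge \underline{x} := \underline{s}) \supset p\,\overline{x}\,\underline{x}$. In a successful derivation, all rule a2_atm" does is to pick such a clause. Then, applications of rule c2_all will instantiate variables $\overline{x}\,\underline{x}$ with the terms $\overline{t}\,\underline{t}$, and next rule c2_imp will invoke the instantiated residual $[\overline{t}/\overline{x}, \underline{t}/\underline{x}](\exists\vec{y}.\,R \wedge \underline{x} := \underline{s})$. Now, because $\underline{x}$ does not occur in $R$ and $\overline{x}\,\underline{x}$ cannot appear in $\underline{s}$, this formula reduces to $\exists\vec{y}.\,([\overline{t}/\overline{x}]R \wedge \underline{t} := \underline{s})$ by pushing the substitution in. Rule r2_exists will then instantiate the variables $\vec{y}$ with terms $\vec{u}$ (which cannot mention variables $\overline{x}\,\underline{x}$). Pushing this substitution in yields the formula $[\overline{t}/\overline{x}, \vec{u}/\vec{y}]R \wedge \underline{t} := [\vec{u}/\vec{y}]\underline{s}$, since variables in $\vec{y}$ can occur in neither $\overline{t}$ nor $\underline{t}$. Finally, by rule r2_assg, $\underline{t}$ and $[\vec{u}/\vec{y}]\underline{s}$ must be equal in a successful derivation. This necessary sequence of steps is captured by the following derived backchaining macro-rule,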

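\[
\dfrac{\Psi,\ \forall\overline{x}\,\underline{x}.\,(\exists\vec{y}.\,R \wedge \underline{x} := \underline{s}) \supset p\,\overline{x}\,\underline{x},\ \Psi' \longrightarrow [\overline{t}/\overline{x}, \vec{u}/\vec{y}]R}
      {\Psi,\ \forall\overline{x}\,\underline{x}.\,(\exists\vec{y}.\,R \wedge \underline{x} := \underline{s}) \supset p\,\overline{x}\,\underline{x},\ \Psi' \longrightarrow p\,\overline{t}\,([\vec{u}/\vec{y}]\underline{s})}\ \text{g2\_atm}'
\]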

where we have carried out the assignment $\underline{t} := [\vec{u}/\vec{y}]\underline{s}$ in the conclusion. This rule can be seen as a refinement of g1_atm' in $\mathcal{L}^c_1$ that makes use of the trailing assignment in the compiled clauses of $\mathcal{L}^c_2$. With this derived inference, rules a2_atm", c2_imp and c2_all become unnecessary: the system consisting of rules a2_atm', g2_atm', the goal rules for implication and universal quantification, and the residual rules is equivalent to that in Figure 2.

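Taking rule g2_atm' as primitive amounts to replacing compiled clauses with the following synthetic connective, which refines $\mathcal{L}^c_1$'s $\Lambda p\,\vec{x}.\,R$.

\[
\forall\overline{x}\,\underline{x}.\ p\,\overline{x}\,\underline{x} \subset \exists\vec{y}.\,(R \wedge \underline{x} := \underline{s})
\qquad\text{becomes}\qquad
\Lambda p\,\overline{x}.\ \exists\vec{y}.\,(R\,;\ \mathsf{return}\ \underline{s})
\]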

[Figure 8: inference rules for the compilation judgments, grouped into the segments "Programs" (rules p2c_empty and p2c_clause), "Heads", "Clauses" (including rule c2c_all), "Atomic goals" (rules a2c_p, a2c_in and a2c_ot) and "Goals" (including rule g2c_imp).]

Fig. 8. Compilation of $\mathcal{L}^s$ into $\mathcal{L}^c_2$.



The variables $\vec{y}$ are then interpreted as local variables for the execution of this clause. In this, they are akin to the $Y_n$ permanent variables of the WAM (Aït-Kaci 1991).

In a valid proof in this system, an occurrence of a2_atm' is always immediately followed by an instance of g2_atm': the conclusion of the latter must match the premise of the former. This fact realizes the requirement that, upon returning from a call, the output terms, here $[\vec{u}/\vec{y}]\underline{s}$, must be checked against the terms in output position of the caller.



4.3 Compilation 

Compilation transforms logic programs in $\mathcal{L}^s$ to compiled programs in $\mathcal{L}^c_2$. The input does not have to be well-moded at the level of detail considered here, but this would be operationally advantageous in a refinement of the semantics in Figure 2 that handles quantifiers lazily. We will make use of two auxiliary notions in this section: pseudo clauses, which we encountered already in Section 3.2, and the analogous notion of pseudo atomic goal. They are defined as follows:

\[
\begin{array}{lrcl}
\text{Pseudo Clauses:} & \mathcal{C} & ::= & \Box \supset p\,\vec{x} \;\mid\; \forall x.\,\mathcal{C}\\
\text{Pseudo Atomic Goals:} & \mathcal{F} & ::= & p\,\overline{t}\,\underline{z} \wedge \Box \;\mid\; \exists z.\,\mathcal{F}
\end{array}
\]

Just as pseudo clauses retain the outer structure of a clause, replacing the embedded residual with a hole ($\Box$), pseudo atomic goals have a hole in place of their trailing matches. The general forms of pseudo clauses and pseudo atomic goals, accounting for input and output positions, are $\forall\overline{x}\,\underline{x}.\ \Box \supset p\,\overline{x}\,\underline{x}$ and $\exists\underline{z}.\,(p\,\overline{t}\,\underline{z} \wedge \Box)$. In Section 3.2, we wrote $\mathcal{C}[R]$ for the replacement of the hole of $\mathcal{C}$ with the residual $R$ and noted that variable capture could (and generally will) occur. Similarly, we write $\mathcal{F}[M]$ for the replacement of the hole of $\mathcal{F}$ with matches $M$.
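Hole filling with deliberate variable capture can be illustrated with a small Python sketch. The tuple encoding of formulas and the names `HOLE`, `plug`, `pseudo_clause` and `residual` are assumptions made for this example only.

```python
# Formulas are nested tuples; HOLE stands for the box in a pseudo clause
# or pseudo atomic goal (hypothetical encoding).
HOLE = ("hole",)


def plug(pseudo, filler):
    """Replace the hole of `pseudo` with `filler`.

    No renaming is performed, so variables of `filler` are captured by the
    quantifiers of `pseudo`, as intended for C[R] and F[M].
    """
    if pseudo == HOLE:
        return filler
    if isinstance(pseudo, tuple):
        return tuple(plug(part, filler) for part in pseudo)
    return pseudo


# Pseudo clause: forall x1. forall x2. HOLE implies p x1 x2
pseudo_clause = ("all", "x1", ("all", "x2", ("imp", HOLE, ("atom", "p", "x1", "x2"))))

# Residual with a match on the input x1 and an assignment to the output x2
residual = ("and", ("match", "x1", "a"), ("assign", "x2", "b"))

compiled = plug(pseudo_clause, residual)
```

After plugging, the occurrences of `x1` and `x2` in the residual fall under the quantifiers of the pseudo clause, which is exactly the capture that the notation relies on.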

The compilation process is modeled by the following five judgments, which are reminiscent of the compilation judgments for $\mathcal{L}^c_1$. They are more complex because clause compilation now needs to handle both matching and assignment as opposed to a generic equality. Furthermore, a new judgment is needed to compile atomic goals.

\[
\begin{array}{ll}
\Gamma \gg \Psi & \text{Program $\Gamma$ is compiled to $\Psi$}\\
\vec{x} \vdash a \gg \mathcal{C} \setminus I \setminus O & \text{Head $a$ with $\vec{x}$ is compiled to $\mathcal{C}$, $I$ and $O$}\\
A \gg \mathcal{C} \setminus R \setminus O & \text{Clause $A$ is compiled to $\mathcal{C}$, $R$ and $O$}\\
\vec{t} \vdash a \gg \mathcal{F} \setminus M & \text{Atomic goal $a$ with $\vec{t}$ is compiled to $\mathcal{F}$ and $M$}\\
A \gg G & \text{Goal $A$ is compiled to $G$}
\end{array}
\]

We write $I$ and $O$ for a conjunction of matches (compilation of terms in input position) and assignments (compilation of output terms), respectively, in the body of a compiled clause. In compiled atomic goals, we write $M$ for a conjunction of matches.

The rules for compilation, which define these judgments, are shown in Figure 8. Compiling a clause $A$, modeled by the judgment $A \gg \mathcal{C} \setminus R \setminus O$, returns a pseudo clause $\mathcal{C}$, the residual $R$ (inclusive of input matches) and the output assignments $O$ that will fill its hole. The rules in the "Clauses" segment build up this residual starting with the compilation of its head, which is displayed in the "Heads" segment. The rules therein differ from the similar inference for $\mathcal{L}^c_1$ by the fact that they dispatch terms in input and output positions to the $I$ and $O$ zones of the judgment as matches and assignments respectively. Residuals and assignments are plugged into the hole of the pseudo clause once this clause has been fully compiled, as can be seen in the "Programs" segment and in rule g2c_imp.

The compilation of goals differs from $\mathcal{L}^c_1$ in the treatment of atomic formulas: upon encountering an atom $a$, the compilation appeals to the new judgment $\cdot \vdash a \gg \mathcal{F} \setminus M$. It generates a pseudo atomic formula $\mathcal{F}$ and matches $M$, which are integrated in rule g2c_atm. The zone to the left of the turnstile serves as an accumulator, very much like when compiling heads.

The target language, $\mathcal{L}^c_2$, is sound and complete with respect to $\mathcal{L}^s$. The following lemma collects some auxiliary results needed to prove this property. The first two statements are proved by induction on the structure of $a$; the third by induction on the given derivation.

Lemma 4.1

• If $\vec{x} \vdash a \gg \mathcal{C} \setminus I \setminus O$, then for any term sequence $\vec{t}$ of the same length as $\vec{x}$ and program $\Psi$ we have $\Psi \longrightarrow [\vec{t}/\vec{x}](\mathcal{C}[I \wedge O]) \gg a\,\vec{t}$.

• If $\vec{t} \vdash a \gg \mathcal{F} \setminus M$, then for all $\Psi$ we have $\Psi \longrightarrow \mathcal{F}[M] \gg a\,\vec{t}$.

• If $\Psi \longrightarrow \mathcal{C}[R] \gg a$, then $\Psi \longrightarrow R$.

We have the following soundness and completeness theorems for $\mathcal{L}^c_2$. In both cases, the proof proceeds by mutual induction over the first derivation in the antecedent.

Theorem 4.2 (Soundness of the compilation to $\mathcal{L}^c_2$)

• If $\Gamma \longrightarrow A$, $\Gamma \gg \Psi$ and $A \gg G$, then $\Psi \longrightarrow G$.

• If $\Gamma \longrightarrow A \gg a$, $\Gamma \gg \Psi$ and $A \gg \mathcal{C} \setminus R \setminus O$, then $\Psi \longrightarrow \mathcal{C}[R \wedge O] \gg a$.

Theorem 4.3 (Completeness of the compilation to $\mathcal{L}^c_2$)

• If $\Psi \longrightarrow G$, $\Gamma \gg \Psi$ and $A \gg G$, then $\Gamma \longrightarrow A$.

• If $\Psi \longrightarrow C \gg a$, $\Gamma \gg \Psi$, $C = \mathcal{C}[R \wedge O]$ and $A \gg \mathcal{C} \setminus R \setminus O$, then $\Gamma \longrightarrow A \gg a$.

To conclude this section, we revisit our ongoing examples. Here, we assume that the
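mode of the predicate $\mathsf{of}$ makes its first argument an input and its second an output. The result of compiling our two familiar clauses into $\mathcal{L}^c_2$ is shown in Figure 9. As in Section 3.2, the moded compilation process offers ample opportunities for optimization: matches and assignments with variables on both sides and the corresponding existential quantification can often be elided, and all occurrences of $\top$ can be optimized away.

\[
\begin{array}{l}
\textbf{1.}\quad \forall E_1.\forall E_2.\forall T_1.\forall T_2.\ \mathsf{of}\ (\mathsf{app}\ E_1\ E_2)\ T_2 \subset \mathsf{of}\ E_1\ (\mathsf{arr}\ T_1\ T_2) \subset \mathsf{of}\ E_2\ T_1\\[1ex]
\quad\gg\quad \forall x_1.\forall x_2.\ \mathsf{of}\ x_1\ x_2\\
\qquad\qquad \subset (\exists E_1.\exists E_2.\exists T_1.\exists T_2.\ \top\\
\qquad\qquad\qquad \wedge\ x_1 =: \mathsf{app}\ E_1\ E_2\\
\qquad\qquad\qquad \wedge\ \exists z_1.\,(\mathsf{of}\ E_1\ z_1 \wedge z_1 =: \mathsf{arr}\ T_1\ T_2 \wedge \top)\\
\qquad\qquad\qquad \wedge\ \exists z_2.\,(\mathsf{of}\ E_2\ z_2 \wedge z_2 =: T_1 \wedge \top)\\
\qquad\qquad\qquad \wedge\ x_2 := T_2 \wedge \top)\\[2ex]
\textbf{2.}\quad \forall E.\forall T_1.\forall T_2.\ \mathsf{of}\ (\mathsf{lam}\ T_1\ E)\ (\mathsf{arr}\ T_1\ T_2) \subset (\forall x.\ \mathsf{of}\ x\ T_1 \supset \mathsf{of}\ (E\ x)\ T_2)\\[1ex]
\quad\gg\quad \forall x_1.\forall x_2.\ \mathsf{of}\ x_1\ x_2\\
\qquad\qquad \subset (\exists E.\exists T_1.\exists T_2.\ \top\\
\qquad\qquad\qquad \wedge\ x_1 =: \mathsf{lam}\ T_1\ E\\
\qquad\qquad\qquad \wedge\ \exists z.\,((\forall x.\ (\forall x'_1.\forall x'_2.\ \mathsf{of}\ x'_1\ x'_2 \subset \top \wedge x'_1 =: x \wedge x'_2 := T_1 \wedge \top)\\
\qquad\qquad\qquad\qquad\qquad \supset \mathsf{of}\ (E\ x)\ z)\\
\qquad\qquad\qquad\qquad \wedge\ z =: T_2 \wedge \top)\\
\qquad\qquad\qquad \wedge\ x_2 := \mathsf{arr}\ T_1\ T_2 \wedge \top)
\end{array}
\]

Fig. 9. $\mathcal{L}^c_2$ Compilation Example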
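One of the optimizations just mentioned, dropping $\top$ and eliding a match whose two sides are variables together with its existential quantifier, can be sketched as follows. The conjunct encoding and the function names `subst` and `simplify` are hypothetical, and the sketch assumes without checking that the elision is safe for the clause at hand (the text only says it "can often" be applied).

```python
def subst(term, var, repl):
    """Replace every occurrence of variable `var` in `term` by `repl`."""
    if term == var:
        return repl
    if isinstance(term, tuple):
        return tuple(subst(part, var, repl) for part in term)
    return term


def simplify(exvars, conjuncts):
    """Drop ("top",) conjuncts and elide variable-to-variable matches.

    `exvars` lists the existentially quantified variables; a conjunct
    ("match", z, v) with z in `exvars` and v a variable (a string) is
    removed by substituting v for z and dropping the quantifier for z.
    """
    exvars = list(exvars)
    body = [c for c in conjuncts if c != ("top",)]
    changed = True
    while changed:
        changed = False
        for c in body:
            if c[0] == "match" and c[1] in exvars and isinstance(c[2], str):
                z, other = c[1], c[2]
                body = [subst(d, z, other) for d in body if d is not c]
                exvars.remove(z)
                changed = True
                break
    return exvars, body


# exists z2. (of E2 z2  and  z2 =: T1  and  top)
second_call = simplify(["z2"], [("atom", "of", "E2", "z2"),
                                ("match", "z2", "T1"),
                                ("top",)])

# exists z1. (of E1 z1  and  z1 =: arr T1 T2  and  top): the match is kept
first_call = simplify(["z1"], [("atom", "of", "E1", "z1"),
                               ("match", "z1", ("arr", "T1", "T2")),
                               ("top",)])
```

On the second atomic goal of the first clause in Figure 9, this leaves just the atom with `T1` in output position; on the first atomic goal only the trailing truth is removed, because the right-hand side of the match is a compound term.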

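It is instructive to rewrite these clauses with the two synthetic connectives introduced earlier for $\mathcal{L}^c_2$, again omitting $\top$ for readability:

\[
\begin{array}{l}
\Lambda\mathsf{of}\ x_1.\ \exists E_1.\exists E_2.\exists T_1.\exists T_2.\ x_1 =: \mathsf{app}\ E_1\ E_2\\
\qquad \wedge\ \mathsf{call}\ (\mathsf{of}\ E_1) =: (\mathsf{arr}\ T_1\ T_2)\ \wedge\ \mathsf{call}\ (\mathsf{of}\ E_2) =: T_1;\\
\qquad \mathsf{return}\ T_2\\[2ex]
\Lambda\mathsf{of}\ x_1.\ \exists E.\exists T_1.\exists T_2.\ x_1 =: \mathsf{lam}\ T_1\ E\\
\qquad \wedge\ \forall x.\ (\Lambda\mathsf{of}\ x'_1.\ x'_1 =: x\,;\ \mathsf{return}\ T_1) \supset \mathsf{call}\ (\mathsf{of}\ (E\ x)) =: T_2;\\
\qquad \mathsf{return}\ (\mathsf{arr}\ T_1\ T_2)
\end{array}
\]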
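Read operationally, the two synthetic clauses above behave like a non-deterministic function from a term to its type. The following Python sketch mirrors them under several assumptions that are not in the paper: terms are tuples, the body of a `lam` is a Python function (a stand-in for higher-order abstract syntax), hypothetical clauses introduced by implication are kept in a tuple `clauses`, and fresh parameters for the universal quantifier come from a counter.

```python
import itertools

_fresh = itertools.count()  # supply of fresh parameters for the universal goal


def of(term, clauses=()):
    """Yield every type of `term`.

    `clauses` holds the hypothetical clauses added by implication, each
    recorded as a pair (x, T1) for (Lambda of x'. x' =: x ; return T1).
    """
    # Hypothetical clauses: match x' =: x, then return T1.
    for x, t1 in clauses:
        if term == x:
            yield t1
    # First clause: x1 =: app E1 E2
    if isinstance(term, tuple) and term[0] == "app":
        _, e1, e2 = term
        for fun_ty in of(e1, clauses):            # call (of E1) =: arr T1 T2
            if isinstance(fun_ty, tuple) and fun_ty[0] == "arr":
                _, t1, t2 = fun_ty
                for arg_ty in of(e2, clauses):    # call (of E2) =: T1
                    if arg_ty == t1:
                        yield t2                  # return T2
    # Second clause: x1 =: lam T1 E
    if isinstance(term, tuple) and term[0] == "lam":
        _, t1, body = term
        x = ("const", next(_fresh))               # fresh parameter for forall x
        for t2 in of(body(x), clauses + ((x, t1),)):  # call (of (E x)) =: T2
            yield ("arr", t1, t2)                 # return (arr T1 T2)


identity = ("lam", "nat", lambda x: x)
apply_fn = ("lam", ("arr", "nat", "nat"),
            lambda f: ("lam", "nat", lambda y: ("app", f, y)))
```

Each `for` loop plays the role of a `call`, and the comparison or destructuring that follows it plays the role of the match on the returned value; an ill-typed term simply yields no answer.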



5 Larger Source Languages

In (Cervesato 1998), we illustrated our original abstract logical compilation method on the language of hereditary Harrop formulas. This language differs from $\mathcal{L}^s$ by the presence of conjunction (formulas of the form $A \wedge B$) and truth ($\top$). While our original treatment could handle them easily (in a clause position, they were compiled to disjunctions and falsehood respectively), the approach taken in Sections 3 and 4 does not support them directly. The problem is that, as soon as we allow these connectives, clauses can have multiple heads (or even none). Consider for example:

\[
\forall x.\,\forall y.\ q\,x\,y \supset (p_1\,x\,y \wedge (r\,x\,y \supset p_2\,x))
\]

This clause has two heads: $p_1\,x\,y$ and $p_2\,x$. What should it be compiled to? To ensure immediacy (embodied in the macro-rule g1_atm'), our compilation strategy produces a pseudo clause applied to a residual, thereby exposing the (flattened) head of a compiled clause as close to the top level as possible. How can this be achieved now that there may be more than one head?

One approach to dealing with this problem is to observe that $\wedge$ distributes over (the antecedent of) $\supset$ and $\forall$. By doing so to the above example, we obtain the formula

\[
(\forall x.\,\forall y.\ q\,x\,y \supset p_1\,x\,y) \wedge (\forall x.\,\forall y.\ q\,x\,y \supset r\,x\,y \supset p_2\,x)
\]

Observe that it is a conjunction of $\mathcal{L}^s$ clauses. Each of them can now be compiled as in Section 3 and the results can be combined by means of a disjunction. This approach generalizes to the full language of hereditary Harrop formulas. It pushes the conjunctions to the outside, leaving inner formulas resembling the clauses of $\mathcal{L}^s$ (conjunction and truth in a goal position are left alone as they are not problematic). Clauses with no head (e.g., $A \supset \top$) are reduced to $\top$. These preprocessing steps can be implemented as a source-code transformation or integrated in the compilation process.
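As a source-code transformation, the distribution step can be sketched in a few lines of Python. The tuple encoding of clauses and the function name `split_heads` are assumptions for this illustration; goals in the antecedent of an implication are left untouched, as in the text.

```python
def split_heads(clause):
    """Return the list of single-headed clauses whose conjunction is `clause`.

    Conjunction in clause position is pushed outward through implication
    and universal quantification; a headless clause yields the empty list,
    which stands for truth.
    """
    kind = clause[0]
    if kind == "atom":
        return [clause]
    if kind == "top":
        return []
    if kind == "and":
        return split_heads(clause[1]) + split_heads(clause[2])
    if kind == "imp":                      # ("imp", goal, clause)
        goal, body = clause[1], clause[2]
        return [("imp", goal, c) for c in split_heads(body)]
    if kind == "all":                      # ("all", variable, clause)
        var, body = clause[1], clause[2]
        return [("all", var, c) for c in split_heads(body)]
    raise ValueError(f"unexpected clause form: {kind!r}")


q = ("atom", "q", "x", "y")
p1 = ("atom", "p1", "x", "y")
r = ("atom", "r", "x", "y")
p2 = ("atom", "p2", "x")

# forall x. forall y. q x y  implies  (p1 x y  and  (r x y implies p2 x))
example = ("all", "x", ("all", "y", ("imp", q, ("and", p1, ("imp", r, p2)))))
single_headed = split_heads(example)
```

Applied to the two-headed example, the function returns the two clauses of the displayed conjunction, each of which has exactly one head and can be compiled separately.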

The other abstract logic programming language examined in (Cervesato 1998) is the language of linear hereditary Harrop formulas, found at the core of Lolli (Hodas and Miller 1994) and LLF (Cervesato and Pfenning 2002). The improved compilation process discussed in this paper extends directly in the presence of linearity. Because linear hereditary Harrop formulas feature a form of conjunction and truth, the technical device just outlined is needed to obtain workable compiled clauses.



6 Future Work 

The discussion in Section 4 sets the stage for a nearly functional operational semantics of well-moded programs. Indeed, given an atomic goal with ground terms in its input positions, proof search will instantiate its output positions to ground terms, if it succeeds. Being in a logic programming setting, more than one answer could be returned. Thus, for well-moded programs, the clauses for a predicate implement a partial, non-deterministic function. This observation informed the choice of the notation for the synthetic operators we exposed: $\mathsf{call}\ p\,\overline{t} =: \underline{t}$ and $\Lambda p\,\overline{x}.\ \exists\vec{y}.\,(R\,;\ \mathsf{return}\ \underline{t})$.

Now we believe that, in the case of well-moded programs, a more detailed operational 
semantics that exposes variable manipulations using logical variables and explicit substi- 
tutions (and restricts the execution order) can bring this functional interpretation to the sur- 
face. This would provide a logical justification for the natural impulse to give well-moded 
programs a semantics that is typical of functional programming languages, where atomic 
predicates carry just input terms and from which the terms in output position emerge by a 
process of reduction. 

In future work, we intend to carry out this program by giving such a detailed operational semantics to $\mathcal{L}^s$ as well as well-moding rules. The goal will then be to perform logical transformations, akin to what we did in this paper, that expose this functional semantics for well-moded programs. It would also allow us to prove formally that the operator $=:$ of Section 4 can indeed be implemented as matching rather than general unification.